Audio-based Distributional Semantic Models for Music Auto-tagging and Similarity Measurement

Authors

  • Giannis Karamanolakis
  • Elias Iosif
  • Athanasia Zlatintsi
  • Aggelos Pikrakis
  • Alexandros Potamianos
Abstract

The recent development of Audio-based Distributional Semantic Models (ADSMs) enables the computation of audio and lexical vector representations in a joint acoustic-semantic space. In this work, these joint representations are applied to the problem of automatic tag generation. The predicted tags, together with their corresponding acoustic representations, are used to construct acoustic-semantic clip embeddings. The proposed algorithms are evaluated on the task of similarity measurement between music clips. Acoustic-semantic models are shown to outperform the state of the art on this task and to produce high-quality tags for audio/music clips.

Similar resources

Automatic Music Tagging With Time Series Models

We present a system for automatic music annotation that leverages temporal (e.g., rhythmical) aspects as well as timbral content. Our system estimates a dynamic texture mixture (DTM) density over time series of acoustic features (instead of on individual features) for each tag in a semantic vocabulary. When analyzing a new song, our system processes the time series of acoustic features of the ...

Audio-Based Distributional Representations of Meaning Using a Fusion of Feature Encodings

Recently a “Bag-of-Audio-Words” approach was proposed [1] for the combination of lexical features with audio clips in a multimodal semantic representation, i.e., an Audio Distributional Semantic Model (ADSM). An important step towards the creation of ADSMs is the estimation of the semantic distance between clips in the acoustic space, which is especially challenging given the diversity of audio...

Auto-tagging Music Content with Semantic Multinomials

We present a system for automatically associating music content with relevant semantic tags. Our supervised multilabel model (SML) consists of one Gaussian mixture model (GMM) distribution over an audio feature space for each tag in our vocabulary. Using the SML model, we annotate a novel song with a semantic multinomial: a normalized vector of likelihoods for a song’s audio features under each...

Thinkit’s Submissions for Mirex2009 Audio Music Classification and Similarity Tasks

This full abstract describes our submitted systems for the MIREX09 audio classification tasks (genre, mood, classical composer, audio tagging) and the music similarity and retrieval task. All the classification systems are based on basic acoustic features (e.g., MFCC) and the GSV-SVM modeling framework, which has been successfully applied in the speaker recognition field. And the similarity systems a...

The Role of Audio and Tags in Music Mood Prediction: A Study Using Semantic Layer Projection

Semantic Layer Projection (SLP) is a method for automatically annotating music tracks according to expressed mood based on audio. We evaluate this method by comparing it to a system that infers the mood of a given track using associated tags only. SLP differs from conventional auto-tagging algorithms in that it maps audio features to a low-dimensional semantic layer congruent with the circumple...

Journal title:
  • CoRR

Volume abs/1612.08391

Publication date: 2016